High-performance K-means Implementation based on a Coarse-grained Map-Reduce Architecture
Abstract
The k-means algorithm is one of the most common clustering algorithms and is widely used in data mining and pattern recognition. The increasing computational requirements of big data applications make hardware acceleration of the k-means algorithm necessary. In this paper, a coarse-grained Map-Reduce architecture is proposed to implement the k-means algorithm on an FPGA. Algorithmic segmentation, data path elaboration and automatic control are applied to optimize the architecture for high performance. In addition, high-level synthesis is used to reduce development time and complexity. For a single iteration of the k-means algorithm, a throughput of 28.74 Gbps is achieved. This is at least a 3.93x speedup over four representative existing FPGA-based implementations and can satisfy the demands of big data applications.
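As a rough software illustration of how one k-means iteration splits into a map step and a reduce step, the following minimal Python sketch may help. It is not the paper's FPGA architecture: the function names (`map_point`, `reduce_assignments`, `kmeans_iteration`), the squared Euclidean distance, and the choice to leave an empty cluster's centroid unchanged are all assumptions made for illustration.

```python
from collections import defaultdict


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def map_point(point, centroids):
    """Map step (assumed form): emit (index of nearest centroid, point)."""
    nearest = min(
        range(len(centroids)),
        key=lambda i: squared_distance(point, centroids[i]),
    )
    return nearest, point


def reduce_assignments(pairs, centroids):
    """Reduce step (assumed form): sum points per cluster, then average."""
    dims = len(centroids[0])
    sums = defaultdict(lambda: [0.0] * dims)
    counts = defaultdict(int)
    for idx, point in pairs:
        counts[idx] += 1
        for d, value in enumerate(point):
            sums[idx][d] += value
    # Assumption: a cluster that received no points keeps its old centroid.
    return [
        tuple(s / counts[i] for s in sums[i]) if counts[i] else tuple(centroids[i])
        for i in range(len(centroids))
    ]


def kmeans_iteration(points, centroids):
    """One iteration: map every point, then reduce to updated centroids."""
    return reduce_assignments((map_point(p, centroids) for p in points), centroids)


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
    print(kmeans_iteration(pts, [(1.0, 1.0), (9.0, 9.0)]))
```

The map calls are independent of one another, which is the property a parallel implementation can exploit. The reduce step only needs per-cluster running sums and counts.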
Similar resources
Using the KressArray for Reconfigurable Computing
Multimedia applications commonly require high computation power mostly in conjunction with high data throughput. As an additional challenge, such applications are increasingly used in handheld devices, where also small package outlines and low power aspects are important. Many research approaches have shown, that accelerators based on reconfigurable hardware can satisfy those performance demand...
Reconfigurable Multi-Array Architecture for Low- Power and High-Speed Embedded Systems
Coarse-grained reconfigurable architecture (CGRA) based embedded systems aim to achieve high system performance with sufficient flexibility to map a variety of applications. However, the CGRA has been considered prohibitive due to its significant area/power overhead and performance bottleneck. In this work, I propose reconfigurable multi-array architecture to reduce power/area and enhan...
Coarse Grained Reconfigurable Array Based Architecture for Low Power Real-Time Seizure Detection
There is increasing research and commercial interest in miniature on-body and implantable devices for continuous real-time biosignal monitoring. A key challenge in realizing this vision is in implementation of biosignal processing algorithms with acceptably low energy consumption. In this article, we investigate implementation of the REACT algorithm for real-time epileptic seizure detection on ...
An energy-efficient coarse grained spatial architecture for convolutional neural networks AlexNet
In this paper, we propose a CGSA (Coarse Grained Spatial Architecture) which processes different kinds of convolution with high performance and low energy consumption. The architecture’s 16 coarse grained parallel processing units achieve a peak 152 GOPS running at 500MHz by exploiting local data reuse of image data, feature map data and filter weights. It achieves 99 frames/s on the convolutio...
An improved joint model: POS tagging and dependency parsing
Dependency parsing is a way of syntactic parsing and a natural language that automatically analyzes the dependency structure of sentences, and the input for each sentence creates a dependency graph. Part-Of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers do the POS tagging task along with dependency parsing in a pipeline mode. Unfortunately, in pipel...
Journal title:
- CoRR
Volume abs/1610.05601, Issue -
Pages -
Publication date: 2016